{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Naive Bayes Simple Male or Female"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "author: Nicholas Farn [<a href=\"sendto:nicholasfarn@gmail.com\">nicholasfarn@gmail.com</a>]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This example shows how to create a simple Gaussian Naive Bayes Classifier using pomegranate. In this example we will be given a set of data measuring a person's height (feet) and try to classify them as male or female. This example is a simplification drawn from the example in the Wikipedia <a href=\"https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Examples\">article</a> on Naive Bayes Classifiers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Populating the interactive namespace from numpy and matplotlib\n"
     ]
    }
   ],
   "source": [
    "from pomegranate import *\n",
    "import seaborn\n",
    "seaborn.set_style('whitegrid')\n",
    "%pylab inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First we'll create the distributions for our model. In this case we'll assume that height, weight, and foot size are normally distributed. We'll fit our distribution to a set of data for males and females."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "male = NormalDistribution.from_samples([ 6.0, 5.92, 5.58, 5.92, 6.08, 5.83 ])\n",
    "female = NormalDistribution.from_samples([ 5.0, 5.5, 5.42, 5.75, 5.17, 5.0 ])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's check on the parameters for our male and female height distributions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEECAYAAADDOvgIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAgAElEQVR4nO3de5hU1Znv8W81FE2DoMZAE1G8+x5yGBVhFAOEqxckikbHeIkxinc4E40aFM0APoOZmYMkeNR4l4ljoraItxwjiQgimJh0UMOxfIkY5YQMoD0qKN1N3+aP2t1W965quqHu9fs8D4+71l676l12735rrbX32pGWlhZEREQSleU6ABERyT9KDiIiEqLkICIiIUoOIiISouQgIiIhPXMdQDq88cYbLeXl5bkOI+Pq6+sphXYmUptLg9qcGzt27PhoxIgRA5LtK4rkUF5eztChQ3MdRsbFYrGSaGcitbk0qM25UV1d/UGqfRpWEhGRECUHEREJUXIQEZEQJQcREQlRchARkZCiuFppV7Zt28bWrVtpaGjIdSh7pKGhgVgsluswuiwajTJw4ED69++f61BEpJuKPjls27aNLVu2MHjwYCoqKohEIrkOabfV1tZSUVGR6zC6pKWlhdraWjZt2gSgBCFSYIp+WGnr1q0MHjyYPn36FHRiKDSRSIQ+ffowePBgtm7dmutwRKSbij45NDQ0FMy37WJUUVFR8MN5IqUoI8NKZvZd4LvBy97AMcB4YBHQCCxz93lmVgbcDRwN1AOXuvu7ZjaqY909iUc9htwp1P/3gxYMYsvnW5LuK6OMZppD5ZV9K9l8/eZMhyaSFRnpObj7Yncf7+7jgWrgH4F7gPOBMcDxZjYcOAPo7e4nADcCtwdvkayuSNakSgxA0sSwq2NECk1Gh5XMbCTwP4HHgHJ33+DuLcCLwGTif/x/BeDuvwVGmln/FHVFRCRLMn210mxgHtAf2JZQvh04NCj/NKG8qZO6KdXX16e8xLOhoYHa2tpQ+cF//CNbszgWPjAa5f1jj+1y/WOOOQaA559/ngMOOAD44gqgqqoq5s+fz6WXXsrMmTN3+V7Tp09n+PDhXaqbCXtyCW5dXV1BXb4bmRceRtuvfD9WTVvV5fcotDang9qcfzKWHMxsH8Dc/eWgN9AvYXc/4BOgT4fyMuKJIVndlDpblTUWiyWdkM5mYmj9vO5OjEejUdasWcNFF10EfHEp68qVK4lEIvTs2bNL79mjR48u182EaDS626tP5sPKlXuqpr6mW20ohjZ3l9qcG9XV1Sn3ZbLn8HXgJQB332ZmO83sMOA94GTiPYoDgNOAJ4JJ6D91UrfkjBw5kuXLl7clB4DPPvuMtWvX8tWvfjWHkRWPziaeRUpZJuccjPgf91ZXAo8CrwNr3f13wFKgzszWAD8Gru2kbsmZNGkS1dXVbN++va1s5cqVjBw5kr59+7are9999zFp0iSGDRvGmDFjWLRoUcr3ffzxx5k0aRLDhw/nvPPO46233spYG/KdEkP6DFq9msiKFURWrGDQ6tW5Dkf2UMZ6Du7+vzu8/i0wqkNZM/FE0PHYUN1SdNhhhzF48GBeeeUVpk6dCsBLL73E5MmTee6559rqPfPMMzz00EP8+Mc/5sADD2TVqlXMnTuXCRMmcNRRR7V7z+XLl7No0SJuvfVWDj/8cF544QUuuugiXnzxRQYOHJjV9klx2ZIwVLtF97YUvKK/Ca7QTZw4keXLlwPxid1XX32VSZMmtatTWVnJj370I0444QQOOOAAzjvvPAYMGMCf//zn0Ps98MADXH755UyePJmDDz6Yq666imHDhlFVVZWV9kjpUC+isBX92kqFbtKkSVx99dU0Njby+9//nsMPP5z99tuvXZ1Ro0bx5ptvcvvtt7NhwwZisRgffvghzc3h6/E3bNjAwoUL2w077dy5k0GDBmW8LVKa1IsoTEoOee7YY4+lR48eVFdX8/LLL3PiiSeG6lRVVXHbbbdx9tlnc9JJJzFr1iy+853vJH2/pqYmZs2axZgxY9qV9+nTJyPxS1yyS1x1R7XkMyWHPFdWVsb48eNZvnw5r7zyCldccUWozi9+8QuuvPLKtn3btm2jpqaGlpaWUN1DDjmEzZs3c9BBB7WVzZkzh+OOO65tXkOyoxgmwwetXq2eQZHSnEMBmDRpElVVVey9994ceOCBof377rsvr732Gu+99x7r1q3j2muvpaGhgZ07d4bqXnzxxTzyyCMsXbqUjRs3cuedd7JkyRIOPbTT+wxFklJiKF4l23OojEaz+otdGY3u9rGjR4+mqamJCRMmJN0/e/Zsbr75Zs4880z23XdfpkyZQt++fXn77bdDdU899VRqamq488472bp1K4ceeih33XVXzm/GEZH8UrLJYfPo0bkOoVPu3rZdUVHBm2++2W4ZkEceeaRt+7DDDuOxxx5L+V6JdQEuvPBCLrzwwjRGKyLFRsNKIiISUrI9BxHJnsiKFW3bldFo3vfcRclBRLohHVcnaRK7MGhYSUS6TH/YS4eSg4iIhGhYSUqCluYW6R71HKQkKDGIdI+Sg4iIhCg5iIhISMnOOWR7DLq7K3BOnDiRTZs2hcqPOOIInn/++XSG1qkbb7yRxsZGFixYkLXPFJHcK9nkkO0x6N35vBtvvJFvfOMbba/r6urYa6+90hmWiEhSJZscCsFee+3FgAED2l7X1tZSUVGRw4hEpFRozqFAPf7440yaNInhw4dz3nnn8dZbb7XtmzhxIo8//jhnnXUWRx11FNOnT2fTpk3MnDmTo48+mjPOOIMNGza01V+yZAlTpkxh2LBhHH/88cyZM4fGxsakn/ub3/yGqVOncvTRR3PmmWfyyiuvZLytIpJ9Sg4FaPny5SxatIibbrqJpUuX8vWvf52LLrqIrVu3ttW54447+P73v8+jjz7KunXrOPPMMxk7dixVVVWUlZXxk5/8BIA//OEPzJs3j2uvvZYXX3yRefPm8dRTT7Fs2bLQ577zzjvccMMNXHbZZTz33HOcc845zJw5k1gslrW2F5vIvEjo39hnxuY6LJHMDSuZ2U3A6UAv4G5gJbAYaAHWATPcvdnM5gBTgUbgGnd/3cwOT1Y3U7Hmq1tvvZXbbrut7XVLSwsvvfQSDzzwAJdffjmTJ08G4KqrrmLNmjVUVVUxY8YMAKZNm8boYHGz4447jo8//phvfetbAJx++ulUVVUB0Lt3b+bPn89JJ50EwODBg3n44Yd59913Q/E8+OCDnHXWWZxxxhkADBkyhLfeeotHHnmkXZyyZ2rqa3IdgkhmkoOZjQe+BowG+gDXAwuBW9x9hZndA0wzsw+AccDxwIHAEuDvk9UFlmYi1nw2c+ZMTjnllLbXdXV17LPPPmzYsIGFCxeyaNGitn07d+5k0KBBba8TnxhXXl7O/vvv3+5161Pihg0bRu/evbnjjjt49913cXc++OADRo0aFYpnw4YNrF+/niVLlrSVNTQ0cNRRR6WnwSKSNzLVczgZ+BPxP+j9gRuAy4j3HgBeAE4CHFjm7i3ARjPraWYDgBFJ6pZccvjSl77U7lnPtbW19OjRg6amJmbNmsWYMWPa1e/Tp0/bds+e7X+0ZWXJRxBXrVrF1VdfzRlnnMHYsWOZMWMG8+bNS1q3qamJ6dOn881vfrNdea9evbrVLhHJf5lKDl8GDgK+ARwCPAuUBUkAYDuwN/HEkdiHbi2PJKmbUn19fcpx74aGhnZPUMul7sTR3NzMzp072x3T0tJCbW0tBx10EH/9618ZOHBg27758+czYsQITjnllNCxTU1N7T6/oaGh7b0ee+wxTjvtNGbPng1AY2MjGzdu5JhjjqG2tpbGxkaampqora1lyJAhvP/+++0+9+6772afffbh/PPPT9mWhoaG3Z6XqKurK8k5jWJvc8f2leLPOd/bnKnkUAO84+47ATezOuLDRq36AZ8A24LtjuXNScpSKi8vT/kM5FgsljeXf3YnjrKyMnr16tXumNZLWadPn87s2bM58sgjGTFiBM8++yxPP/00F1xwARUVFaFje/ToQc+ePdteR6NRIpEIFRUV7Lfffqxdu5YPPviAHj16cO+99/Lhhx/S0tJCRUVFWw+k9XPPP/98hg8fzoQJE1izZg0PPvggP/3pTzttWzQa3e1nVMdisZJ8vnU+tTkdz3DoqGP7SvHnnA9trq6uTrkvU1crvQqcYmYRM9sf6Au8FMxFAEwBVgGrgZPNrMzMhhDvXXwErE1SN60q+1am+y2z9nmnnnoq1113HXfeeSdTp07l17/+NXfddddu/aLNnDmTgQMHcu6553LxxRcTjUa54IILePvtt0N1jznmGBYsWMATTzzB1KlTWbx4Mbfddhvjxo1LR7MkT+kZDqUp0tLSsutau8HM/g2YQDwBzQb+AtxP/OqlGHCZuzeZ2VziCaAMuNbdXzWzI5PVTfVZsVispbOeQ66zc7oU6k1we/IzSNfPLzIvssfvkU0tczJzXu6OxEd8plvrI0OL6Tztqnxoc3V1dfWIESNGJtuXsUtZ3f0HSYpDXzHdfS4wt0PZ+mR1RaS4qFeSv3QTnIiIhGhtJSkqeuKbSHqo5yBFRYlBJD1KIjlkatJddk3/70UKU9Enh2g0mjc3wZWi2tpaotForsMQkW4q+uQwcOBANm3axI4dO/QtNotaWlrYsWMHmzZtandHtYgUhqKfkO7fvz8Af/vb32go8MvmGhoaCupbeDQapbKysu1nICKFo+iTA8QTRDH8gcqHm2ZEpDQU/bCSiIh0n5KDiIiElMSwkojkr9a1myr/67/YHDy9UHJPPQcRyQtaZym/qOcgIiGZeIaDFBb1HEQkRIlBlBxERCREw0oieSjZw4kq+1ay+frNOYhGSpF6DiIFQivOSjYpOYiISIiSg4iIhCg5iIhIiJKDiIiEZOxqJTP7I7AtePkX4F5gEdAILHP3eWZWBtwNHA3UA5e6+7tmNqpj3UzFKSIiYRlJDmbWG4i4+/iEsjeAs4D3gF+a2XDgEKC3u58QJITbgWnAPR3ruvvaTMQqIiJhmeo5HA30MbNlwWfMBcrdfQOAmb0ITAa+AvwKwN1/a2Yjzax/irpKDiIiWZKp5LADWAA8ABwBvAB8krB/O3Ao0B/4NKG8KSjblqRuSvX19cRisT2POs/V1dWVRDsTpWrz2GfGUlNfk4OIcqvYf/7F3r5E+X4+Zyo5rAfedfcWYL2ZfQp8KWF/P+LJok+w3aqMeGLol6RuSuXl5SXxhLRSfBJcqjbXPFF6iQHI3s9/S25uuCul3+98OJ+rq6tT7svU1UqXEJ8/wMz2J54EPjezw8wsApwMrAJWA6cG9UYBf3L3bcDOJHVFRCRLMtVzeBBYbGavAi3Ek0Uz8CjQg/gVSL8zs98DJ5rZGiACXBwcf2XHuhmKU0REkshIcnD3ncD5SXaN6lCvmXgi6Hj8bzvWFZHM0fMbpCPdBCciSgwSoiW7RSRvtD5PGqAyGtUzpXNIPQcRyUvqzeSWkoOIiIQoOYiISIiSg4iIhCg5iIhIiK5WEikgkXmRUFll30o2X785B9FIMVPPQaTAbfk8N+sgSXFTchARkRAlBxERCVFyEBGRECUHEREJUXIQEZEQXcoqeW3sM2NL9qlvIrmknoPktVJ8TrRIPlByEBGRECUHEREJUXIQEZEQJQcREQlRchARkZAuJQczu6XD6x9lJhwRyZZBq1cTWbGi3XObRVp1ep+DmU0HLgWGmtmpQXEPIArctItjBwLVwIlAI7AYaAHWATPcvdnM5gBTg/3XuPvrZnZ4srq71ToRSUnPaJbO7Krn8B/AecATwX/PA84GTujsIDOLAvcCtUHRQuAWdx8LRIBpZnYsMA44HjgXuCtV3W62SURE9lCnPQd3rwfeN7MrgZFA72DXIcArnRy6ALiHL3oXI4CVwfYLwEmAA8vcvQXYaGY9zWxAirpLO4uzvr6eWCzWWZWiUFdXVxLtlO4r1t+L1iGv/crKWDVgQG6DSbN8P5+7unzGk8BA4P8Hr1tIkRzM7LvAh+7+opm1JodIkAQAtgN7A/2BxNtfW8uT1e1UeXk5Q4cO7WJTClcsFiuJdkr37dbvxZbCeUhQTXNz0f3u58P5XF1dnXJfV5PDIHf/WhfrXgK0mNlk4BjgZ8QTS6t+wCfAtmC7Y3lzkjIREcmirl7K+o6Z7d+Viu7+dXcf5+7jgTeA7wAvmNn4oMoUYBWwGjjZzMrMbAhQ5u4fAWuT1BURkSzqas9hLPF5gQ+D1y3u3qVkEbgOuN/MegEx4El3bzKzVcBrxJPUjFR1u/E5IiKSBl1KDu5+xO68edB7aDUuyf65wNwOZeuT1RURkezpUnIws4eJT0K3cfdLMhKRiIjkXFeHlR4L/hsBjgW6M6QkIhkWmRcJlVX2rWTz9ZtzEI0Ug64OK72Y8PJXZrYsQ/GISJps+bxwLlWV/NPVYaWTEl5+BajMTDgiIpIPujqsdF7Cdh3xexlERKRIdXVY6WIzGwZ8FVjv7m9kNiwREcmlrg4r/S/gfOB3wPVm9oS7L8hoZFJSBi0YpDFykTzS1TukzwfGuvs1wGjgW5kLSUqREoNIfulqcoi4eyOAuzcAWgheRKSIdXVC+lUze5L4OkdjiK+LJCIiRWqXPQczu5z4cxkeJr589kp3vyHTgYmISO50mhzMbC7xh+1E3f2XxJffnmhmP8xCbCIikiO76jlMAf7B3XcAuPv7xCejT89wXCIikkO7Sg6fJTyVDWibkN6euZBERCTXdpUcas3s0MSC4HVLivoiIhkRWbGCyIoVDFqt62GyYVdXK80Cnjazl4D3gCHAycBFmQ5MRNJv0OrVbGko7CvRCz3+QtFpz8Hd/x/xp8CtBfoCfwRGu/vaLMQmImmmP6zSVbu8z8HdPyV+lZKIiJSIrt4hLSIiJUTJQUREQpQcREQkpKtrK3WLmfUA7geM+GWvVxJ/SNDi4PU6YIa7N5vZHGAq0Ahc4+6vm9nhyepmIlYREQnLVM/hNAB3Hw3cAswHFgK3uPtYIAJMM7NjgXHA8cC5wF3B8aG6GYpTRESSyEjPwd2fNrPng5cHAZ8Ak4GVQdkLxNdscmBZcBf2RjPraWYDgBFJ6i5N9Xn19fXEYrH0NyTP1NXVlUQ7JX2K9felGNqV7+dzRpIDgLs3mtm/A2cCZwMnJizFsZ34Cq/9gZqEw1rLI0nqplReXs7QoUPTGX5eisViJdFOSZ/Q78uW4nioUjGcB/lwPldXV6fcl9EJaXe/CDiS+PxDRcKufsR7E9uC7Y7lzUnKREQkSzKSHMzsQjO7KXi5g/gf+z+Y2figbArxBwetBk42szIzGwKUuftHwNokdUVEJEsyNaz0FPCwmb0CRIFrgBhwv5n1CrafdPcmM1sFvEY8Uc0Ijr+uY90MxSlS1CLzIuHC6L7wtaeyH4wUlExNSH8OnJNk17gkdecCczuUrU9WV0TSoOHjXEcgBSBjE9IiyQxaMIgtnxfHpKhIMdMd0pJVSgwihUHJQUREQpQcREQkRMlBRERClBxERCREVyuJSMGJrFjRtl0ZjbJ59OjcBVOk1HMQkYKm52JnhpKDiIiEKDmIiEiIkoOIiIQoOYiISIiSg4iIhCg5iIhIiJKDiIiE6CY4kVK0ckK4TA8BkgTqOYhInB4CJAmUHEREJETJQUREQpQcREQkRBPSkhF6VrRIYUt7cjCzKPAQcDBQDvwz8DawGGgB1gEz3L3ZzOYAU4FG4Bp3f93MDk9WN91xSmYpMYgUtkwMK30bqHH3scApwJ3AQuCWoCwCTDOzY4FxwPHAucBdwfGhuhmIUUREOpGJYaUq4MlgO0K8VzACWBmUvQCcBDiwzN1bgI1m1tPMBqSou7SzD6yvrycWi6W1Efmorq6uJNop0l2tD//Zr6yMVQMG5DaYLsr38zntycHdPwMws37Ek8QtwIIgCQBsB/YG+gM1CYe2lkeS1O1UeXk5Q4cOTU8D8lgsFiuJdorsrprm5oI5R/LhfK6urk65LyNXK5nZgcDLwCPu/nMgcc6gH/AJsC3Y7lierK6IiGRR2pODmVUCy4BZ7v5QULzWzMYH21OAVcBq4GQzKzOzIUCZu3+Uoq6IiGRRJuYcZgP7Aj80sx8GZd8D7jCzXkAMeNLdm8xsFfAa8SQ1I6h7HXB/Yt0MxCgiIp3IxJzD94gng47GJak7F5jboWx9sroiIpI9ukNaRERClBxERCREyUFEREKUHEREJEQL74nIF/SEOAmo5yAindMT4kqSkoOIiIQoOYiISIjmHGSP6KE+IsVJyUH2iBKD5JvW5bsro1E2jx6d22AKmIaVRKQobWloyHUIBU3JQUREQpQcREQkRMlBRERClBxERCREyUFEREKUHEREJETJQUREQnQTnIjsWrLVWkErthYx9RxEZPdpxdaipZ6DdInWUJJC1LqUBmg5je7KWHIws+OBf3X38WZ2OLAYaAHWATPcvdnM5gBTgUbgGnd/PVXdTMUpXaPEIIVOy2l0T0aGlczsB8ADQO+gaCFwi7uPBSLANDM7FhgHHA+cC9yVqm4mYhQRkdQyNeewAfhmwusRwMpg+wVgMjAGWObuLe6+EehpZgNS1BURkSzKyLCSuy8xs4MTiiLu3hJsbwf2BvoDNQl1WsuT1e1UfX09sVhsj+POd3V1dSXRTpFMyafzJ9/P52xNSCfOGfQDPgG2Bdsdy5PV7VR5eTlDhw5NQ5j5LRaLlUQ7RTIln86ffDifq6urU+7L1qWsa81sfLA9BVgFrAZONrMyMxsClLn7Rynqiki+Wjkh/G/NN3d9nOS1bPUcrgPuN7NeQAx40t2bzGwV8BrxJDUjVd0sxSgi6aL7HwpexpKDu78PjAq21xO/MqljnbnA3A5lSeuKiEj26A5pEREJ0R3S0o7uhJZi1nrHtO6W3jX1HKQdJQYpBbpbeteUHEREJETJQUREQpQcREQkRBPSIpIZyR4QpIcDFQwlBxHJnjy6OU5XLnVOw0oiUtJ05VJy6jmUIN3LICK7op5DCVJiEJFdUXIQEZEQDSuJSHbpKqaCoJ6DiOReHl3FJHHqORQxTTyLdI0uaw1Tz6GIKTGIdI8ua/2CkoOIiIRoWKkIaPhIikKeTFS3DjFBaQ8zqedQBJQYpGjleKK6lIeZlBxERCREw0oFZOwzY6l5oibXYYhkV46Hm0r1Sqa8TA5mVgbcDRwN1AOXuvu7uY0qezSHILILDR9nPWlsaWgoqUSRl8kBOAPo7e4nmNko4HZgWo5jyholBpHdlKWkkZgooDiTRb4mhzHArwDc/bdmNjLH8XRJqm/8ZZTRTHOXy0UkzVIlDSJAS7i4m8kkMVmUQdtZXchJI1+TQ3/g04TXTWbW090bk1XesWPHR9XV1R9kJ7TUfjnhl7kOQUTyTHV19W7ty5KDUu3I1+SwDeiX8LosVWIAGDFixIDMhyQiUjry9VLW1cCpAMGcw59yG46ISGnJ157DUuBEM1tDfFDw4hzHIyJSUiItLUkmY0REpKTl67CSiIjkkJKDiIiEKDmIiEhIvk5IlzwzGwhUAye6+zsJ5dcClwIfBkVXuLvnIMS0M7M/Er+MGeAv7n5xwr7LgCuARuCf3f35HISYdrto8yLiN4RuD4qmufunFDAzuwk4HegF3O3uDybsOw34J+I/44fc/f7cRJleu2hz3p7PSg55yMyiwL1AbZLdI4DvuHvO755JJzPrDUTcfXySfYOAfwRGAr2BV83s1+5en90o06uzNgdGACe7+0fZiypzzGw88DVgNNAHuD5hXxT4MfD3wOfAajN71t0Lei2ZztocyNvzWcNK+WkBcA/wtyT7RgA3mdmrwTeSYnE00MfMlpnZ8uD+llbHAavdvT745vwucFROokyvlG0OFp88ArjPzFab2SU5izJ9TiZ+z9JS4Dkgsfc3FHjX3T92953Aq8DXsx9i2nXWZsjj81nJIc+Y2XeBD939xRRVHgOuBCYCY8zsG9mKLcN2EE+KJxNv36Nm1tqz7bicynZg7+yGlxGdtbkv8H+AbwOnAFebWaEnxC8T7/39A1+0NxLsK9afcWdthjw+n5Uc8s8lxG8AXAEcA/wsGFYh+KX6ibt/FHy7+iUwPGeRptd64D/cvcXd1wM1wFeCfR2XU+kHfJLl+DKhszbvABa5+w533w4sJ97TKGQ1wIvuvjMYV68DWpe+Kdafcco25/v5rDmHPOPubV3pIEFc6e6bg6L+wDozG0p8XHYi8FDWg8yMS4C/I/4NeX/ibf3PYN/rwPxgjL6c+BDEupxEmV6dtflI4HEzG078S9wY4N9zEmX6vAp8z8wWEk+CfYn/8QSIAUeY2ZeAz4gPKS3ISZTp1Vmb8/p81h3Seaw1OQDHAnu5+31mdiHxydl64CV3n5PDENPGzHoBi4EhxNdQngWMIj4O/WxwtdLlxP9Q3ubuS3IVa7p0oc03AOcADcDP3P2eXMWaLmb2b8AE4j/H2cB+fPG73Xq1Uhnxq5Xuyl2k6bOLNuft+azkICIiIZpzEBGRECUHEREJUXIQEZEQJQcREQlRchARkRDd5yASCNbBudLdz00o+xfgHXdfnOKYnwAL3X1jiv3vA//D3esSynoD33b3B5LUv4D4mlrPAr8mfl/HacTXWPq5mU0B9k9cvE0kE9RzENkD7n5NqsTQiUHEV+Jsx8z6El+E7Slgf6C/u3+N+I1ypwef9wJwtpn137PIRTqnnoNIF5nZj4CxQA/ivYWqhBsVPwJ+TvybvgMT3f3w4NCfmtkhwfaZwM3AV83sn9z91oSPuABYFmzfQ/yO4XuBQ4Gjzexyd78P+L/Ad4E7MtNSEfUcRDqaaGYrWv8B5wMEwzmHuPsY4ne73mxm+yQcdzPwtLuPA6po/8XrwWBZ7veBE4H5wNsdEgPAeOCtYPvqoM4VQf3lQWIgqDN+z5sqkpp6DiLtLU8y5wDxoZ0RQcIAiAIHJxw3lC/WPlrV4T1b1+rfTHxN/1S+DHTl+QX/SXwJBpGMUc9BpGveAV4OegATgSeADQn71wEnBNuj2h9KxzVqmkl+7m0F9klS3rH+vkFdkYxRchDpmueAz8xsFfGeQGmbjWkAAACjSURBVEuwlHarfwFON7OXgcuIL5aXylagl5n9a4fyFcDxSepvAP7OzK4JXh8PvNT9Joh0nRbeE0kDMzuV+EOafm9mk4HZ7j6xm+/Rj/i8xaRd1PsVcI67b+usnsieUM9BJD3+AtwR9CxuBX7Q3TcIeiI/M7OzUtUxs6nAEiUGyTT1HEREJEQ9BxERCVFyEBGRECUHEREJUXIQEZEQJQcREQn5bzWfpsPXE/EuAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Male distribution has mu = 5.89 and sigma = 0.158\n",
      "Female distribution has mu = 5.31 and sigma = 0.275\n"
     ]
    }
   ],
   "source": [
    "male.plot( n=100000, edgecolor='c', color='c', bins=50, label='Male' )\n",
    "female.plot( n=100000, edgecolor='g', color='g', bins=50, label='Female' )\n",
    "plt.legend( fontsize=14 )\n",
    "plt.ylabel('Count')\n",
    "plt.xlabel('Height (ft)')\n",
    "plt.show()\n",
    "\n",
    "print(\"Male distribution has mu = {:.3} and sigma = {:.3}\".format( *male.parameters ))\n",
    "print(\"Female distribution has mu = {:.3} and sigma = {:.3}\".format( *female.parameters ))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Everything seems to look good so let's create our Naive Bayes Classifier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "clf = NaiveBayes([ male, female ])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a look at how our classifier calls people of various heights. We can either look at a probabilistic measurement of the sample being male or female, or a hard call prediction. Lets take a look at both."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Height [5.], 4.3856389353194124e-05 chance male and 99.99995614361063 chance female\n",
      "Height [6.], 97.02328468247669 chance male and 2.976715317523313 chance female\n",
      "Height [4.92], 3.2247855094331937e-06 chance male and 99.99999677521448 chance female\n",
      "Height [5.5], 9.788762297712132 chance male and 90.21123770228787 chance female\n"
     ]
    }
   ],
   "source": [
    "data = np.array([[5.0], [6.0], [4.92], [5.5]])\n",
    "\n",
    "for sample, probability in zip( data, clf.predict_proba(data) ):\n",
    "    print(\"Height {}, {} chance male and {} chance female\".format( sample, 100*probability[0], 100*probability[1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Person with height [5.] is female.\n",
      "Person with height [6.] is male.\n",
      "Person with height [4.92] is female.\n",
      "Person with height [5.5] is female.\n"
     ]
    }
   ],
   "source": [
    "for sample, result in zip( data, clf.predict( data )):\n",
    "    print(\"Person with height {} is {}.\".format( sample, \"female\" if result else \"male\" ))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These results look good. We can also train a our classifier with a set of data. This is done by creating a set of observations along with a set with the corresponding correct classification."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{\n",
       "    \"class\" : \"NaiveBayes\",\n",
       "    \"models\" : [\n",
       "        {\n",
       "            \"class\" : \"Distribution\",\n",
       "            \"name\" : \"NormalDistribution\",\n",
       "            \"parameters\" : [\n",
       "                176.25,\n",
       "                9.60143218483576\n",
       "            ],\n",
       "            \"frozen\" : false\n",
       "        },\n",
       "        {\n",
       "            \"class\" : \"Distribution\",\n",
       "            \"name\" : \"NormalDistribution\",\n",
       "            \"parameters\" : [\n",
       "                132.5,\n",
       "                20.463381929681123\n",
       "            ],\n",
       "            \"frozen\" : false\n",
       "        }\n",
       "    ],\n",
       "    \"weights\" : [\n",
       "        0.5,\n",
       "        0.5\n",
       "    ]\n",
       "}"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = np.array([[180], [190], [170], [165], [100], [150], [130], [150]])\n",
    "y = np.array([ 0, 0, 0, 0, 1, 1, 1, 1 ])\n",
    "\n",
    "clf.fit( X, y )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case we fitted the normal distributions to fit a set of data with male an female weights (lbs). Let's check the results with the following data set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = np.array([[130], [200], [100], [162], [145]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's enter it into our classifier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Person with weight [130] is female.\n",
      "Person with weight [200] is male.\n",
      "Person with weight [100] is female.\n",
      "Person with weight [162] is male.\n",
      "Person with weight [145] is female.\n"
     ]
    }
   ],
   "source": [
    "for sample, result in zip( data, clf.predict( data )):\n",
    "    print(\"Person with weight {} is {}.\".format( sample, \"female\" if result else \"male\" ))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Everything looks good from here. In this tutorial we created a simple Naive Bayes Classifier with normal distributions. It is possible to create a classifier with more complex distributions, or even with a Hidden Markov Model."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
